-
Notifications
You must be signed in to change notification settings - Fork 203
ci: verify ecp-elastic-agent-service image is multiarch #11498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ci: verify ecp-elastic-agent-service image is multiarch #11498
Conversation
|
This pull request does not have a backport label. Could you fix it @rubenruizdegauna? 🙏
|
|
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
v1v
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just some minor comments with using container agents for faster builds and CODEOWNERS to help with notifying the corresponding team, instead the robots team
| - label: ":docker: Validate docker image is built for all architectures" | ||
| command: ".buildkite/scripts/steps/validate-agentless-docker-image.sh" | ||
| agents: | ||
| provider: "gcp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
IIUC, we use this provider since it requires using Docker in it. Wonder whether we could use a container agent here with docker on it? like https://hub.docker.com/_/docker
I have not tested btw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it, but I wasn't sure which one to use.
My first approach was to try docker.elastic.co/ci-agent-images/quality-gate-seedling and docker.elastic.co/ci-agent-images/platform-ingest/buildkite-agent-beats-ci-with-hooks:0.5 but none of the have docker.
Then reading the docs I found this
https://docs.elastic.dev/ci/getting-started-with-buildkite-at-elastic#run-docker-in-a-build
Since container management is not supported in the Kubernetes offering, in order to use docker-like commands, you need to use a GCP virtual machine agent.
So I understood that this was the correct way.
Is it ok to use image that are not from docker.elastic.co?
If that's the case, I can give it a try to https://hub.docker.com/_/docker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would strongly prefer if we used something from docker.elastic.co (reasons also include failure domains etc). I have yet to find any image of ours that have docker-in-docker in it.
In this case though, we don't technically need docker, but any of the manifest tooling could achieve this, e.g using oras (going from bash/oras/yq to pseudo-code so bear with me):
$ image_archs=$(oras manifest fetch docker.elastic.co/observability-ci/ecp-elastic-agent-service:git-b1d4e25c181a | yq '[.manifests[].platform.architecture] | sort | join(", ")')
$ echo $image_archs
amd64, arm64
# assert image_archs == amd64, arm64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified to use custom image docker.elastic.co/ci-agent-images/observability/oci-image-tools-agent
(created in https://github.com/elastic/ci-agent-images/pull/2052)
| @@ -0,0 +1,111 @@ | |||
| #!/usr/bin/env bash | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we add an entry in the CODEOWNERS with the team responsible for this script and probably .buildkite/pipeline.agentless-tests.yaml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Added in 36c8137
| setup_extract_sha() { | ||
| # Ensure repo is available - redirect output to /dev/null | ||
| if [ ! -d "serverless-gitops" ]; then | ||
| git clone --depth 1 [email protected]:elastic/serverless-gitops.git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you confirm whether this is accessible in BK? I cannot find the relevant GH check for testing this PR Maybe I'm blind. Do you mind pointing where I can see it? 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was inspired by https://github.com/elastic/elastic-agent/blob/main/.buildkite/scripts/steps/run-agentless-tests.sh#L26 which is triggered from the same bk job, so permissions should be correct.
for testing this PR
TBH I haven't tested this in the pipeline. Just locally assuming that the permissions will work as the same checkout is done in the agentless-tests.
But I can create a branch instead of a fork and trigger a test :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's ok, I assume both scripts are triggered in the same bk pipeline.
I somehow thought it runs automatically for forked PRs too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a good idea in any case, as I found a couple of issues (yq not present and missing wait between steps), so thanks for the ping ;-)
ycombinator
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to unblock PR from merging (needs approval from @elastic/elastic-agent-control-plane until the CODEOWNERS changes in this PR take effect) but will defer to someone from @elastic/ingest-managed-jobs for a second set of eyes.
swiatekm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'm not overly enthused about this being a bash script, but tying it into agent build tooling would probably be worse in the long run.
| setup_extract_sha | ||
|
|
||
| # Extract the SHA for the specified environment | ||
| COMMIT_SHA=$(extract_sha "$ENVIRONMENT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can replace everything up to this point with ${SERVICE_VERSION}?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I checked the BK env vars and SERVICE_VERSION is there so I removed the code that was retrieving it.
| - label: ":docker: Validate docker image is built for all architectures" | ||
| command: ".buildkite/scripts/steps/validate-agentless-docker-image.sh" | ||
| agents: | ||
| provider: "gcp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would strongly prefer if we used something from docker.elastic.co (reasons also include failure domains etc). I have yet to find any image of ours that have docker-in-docker in it.
In this case though, we don't technically need docker, but any of the manifest tooling could achieve this, e.g using oras (going from bash/oras/yq to pseudo-code so bear with me):
$ image_archs=$(oras manifest fetch docker.elastic.co/observability-ci/ecp-elastic-agent-service:git-b1d4e25c181a | yq '[.manifests[].platform.architecture] | sort | join(", ")')
$ echo $image_archs
amd64, arm64
# assert image_archs == amd64, arm64
f6432ab
💚 Build Succeeded
History
|
This pull request adds a new validation step to the Buildkite pipeline to ensure that the agentless Docker image is built for all required architectures before running service tests. The main change is the introduction of a script that checks the presence of both
amd64andarm64architectures in the Docker image manifest for the specified environment.Pipeline enhancements:
.buildkite/pipeline.agentless-tests.yamlto run the validation script before service tests.Docker image validation:
.buildkite/scripts/steps/validate-agentless-docker-image.sh, a script that:amd64andarm64architectures are present, failing the build if any are missing.